1 Introduction

1.1 Motivation

The world’s forcibly displaced population reached a record high in 2017. During the year, the global refugee population increased by 2.9 million, and by the end of the year 68.5 million individuals were forcibly displaced worldwide as a result of persecution, conflict, or generalized violence (https://www.unhcr.org/5b27be547.pdf). Despite the growing demand for refugee admission and assistance, the United States has turned sharply away from supporting refugees. The number of refugees admitted to the United States dropped from a recent high of 84,994 in FY 2016 to 22,874 in FY 2018, the lowest level since 1977. The current ceiling for refugee admission has also dropped to 45,000, the lowest in the history of the current US resettlement program. Because this comes at a time when the global number of refugees has reached a record high, the ratio of refugees admitted to the United States to refugees worldwide has never been lower. For the first time, US refugee admission policy is moving decisively against the worldwide trend in the number of refugees (https://www.cgdev.org/blog/reflecting-world-refugee-day-trends-and-consequences-us-refugee-policy). Recent years thus mark a significant shift in refugee resettlement in the US. This report therefore examines refugee admission trends in the US over the past 10 years (2009-2018).

1.2 Background

According to the UNHCR, refugees are people who have been forced to leave their country due to violence, war, or persecution based on their race, religion, nationality, political opinion, or membership in a particular social group.

Process of refugee resettlement:

  1. Refugee resettlement to the U.S. is a lengthy and thorough process that takes approximately two years and involves numerous U.S. governmental agencies.

  2. Refugees do not choose the country in which they will live. UNHCR, the UN Refugee Agency, identifies the most vulnerable refugees for resettlement and then makes recommendations to select countries.

  3. Once a refugee is recommended to the U.S. for resettlement, the U.S. government conducts a thorough vetting of each applicant. This vetting takes 12 to 24 months and includes:
    • Screening by 8 federal agencies including the State Department, Department of Homeland Security and the FBI
    • Six security database checks and biometric security checks screened against U.S. federal databases
    • Medical screening
    • Three in-person interviews with Department of Homeland Security Officers

Under the Refugee Act of 1980, the president sets an annual ceiling for refugee admissions in consultation with Congress. The annual ceiling has varied over the years, from a high of 231,700 in FY 1980 to a prior low of 67,000 in FY 1986. Amid a large exodus of Syrians from their war-torn country, President Obama raised the refugee ceiling for FY 2016 to 110,000. After taking office, President Trump reduced the FY 2017 ceiling to 50,000 and set the FY 2018 ceiling at a historic low of 45,000. Far fewer refugees, 22,874, were actually resettled in FY 2018.
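To make the gap between ceilings and actual admissions concrete, the short R sketch below computes the share of each ceiling that was filled. It uses only the figures quoted in this report (FY 2016 and FY 2018); the data frame and column names are ours, chosen for illustration.

```r
# Ceilings and admissions quoted in this report.
admissions <- data.frame(
  fiscal_year = c(2016, 2018),
  ceiling     = c(110000, 45000),
  admitted    = c(84994, 22874)
)

# Share of each year's ceiling that was actually filled, in percent.
admissions$pct_of_ceiling <- round(100 * admissions$admitted / admissions$ceiling, 1)

print(admissions)
```

On these figures, roughly 77% of the FY 2016 ceiling was filled, compared with about 51% of the FY 2018 ceiling.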

1.3 Questions

There are currently 25.9 million refugees in the world, reflecting dramatic growth over the past decade. This led us to ask how refugee resettlement has trended over that decade, looking beyond the raw changes in refugee numbers. To better visualize the trend of refugee resettlement to the US, this report focuses on the five countries of origin with the largest resettled refugee populations in the US (Burma, Iraq, Somalia, Bhutan, and the Democratic Republic of the Congo), which together accounted for 60.9% of total refugee arrivals in the US (https://data.newamericaneconomy.org/en/refugee-resettlement-us/).

We are interested in answering the following questions to gain a better understanding of the refugee resettlements in the United States:

  1. What insights can we gain from temporal exploratory data analysis of refugee settlement patterns in the US?
    • Have there been increases or decreases in the resettled refugee population from 2009 to 2018?
    • Is there a correlation between changes in the resettled refugee population and major political events over the 10 years?

  2. What insights can we gain from geographical visualization of refugee settlement patterns in the US over the 10 years? Why might some states have larger refugee settlements than others?

  3. What changes in demographic patterns (i.e., religion, gender, age, etc.) within the refugee population over the 10 years can we visualize? Can we observe any relationship between certain demographics and refugee settlements?
    • Given the number of religions represented in the refugee population, we focus on the top 5 religions in the world (Christianity, Islam, Hinduism, Buddhism, Sikhism) (https://thecountriesof.com/top-5-largest-religions-in-the-world/). The Iraq data separate the Muslim population into three categories: Muslim, Muslim Shiite, and Muslim Suni. For the purposes of this analysis, we combine them into one.

2 Data Sources

We first collected data from the RPC (Refugee Processing Center), which provides refugee arrival information by state and nationality, by destination and nationality, by nationality and religion, and by demographic profile.

The RPC website lets us select the time frame and nationality. Since it does not allow faceting by year, we had to download the files year by year and clean the data into the format we wanted for analysis.

3 Data Transformation

3.1 Cleaning the Data in R

  1. From the website, we downloaded the data as ‘.xlsx’ files. Raw files can be found here.
  2. We wrote two functions to clean the Excel sheet for a given year:
    1. clean_arrival cleans the Excel files of total refugee resettlements for each state.
    2. clean_demographics cleans the Excel files of demographic information for refugees from specific countries (namely Bhutan, Burma, DRC, Iraq, and Somalia).
  3. We wrote another function, combine_files, to combine each year’s Excel file into one.
  4. We saved the results as csv files and uploaded them to GitHub for easy access.
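The combining step can be sketched in base R as follows. This is a minimal illustration rather than the actual combine_files function: the name combine_years_sketch and its interface (a named list of per-year data frames) are assumptions, and reading the ‘.xlsx’ files themselves is left out because it would need an Excel-reading package such as readxl. The toy input reuses the Bhutan ethnicity rows shown later in this report.

```r
# Hypothetical stand-in for the combining step: takes a named list of
# per-year data frames (list names are the years) and stacks them into
# one data frame with an added Year column.
combine_years_sketch <- function(yearly) {
  stopifnot(length(yearly) > 0, !is.null(names(yearly)))
  tagged <- Map(function(df, year) {
    df$Year <- rep(as.integer(year), nrow(df))
    df
  }, yearly, names(yearly))
  combined <- do.call(rbind, tagged)
  rownames(combined) <- NULL
  combined
}

# One small table per year, as if each had been cleaned separately.
yearly <- list(
  "2009" = data.frame(Ethnicity = c("Lhotsampa", "Other"), Total = c(15050, 27),
                      stringsAsFactors = FALSE),
  "2010" = data.frame(Ethnicity = c("Lhotsampa", "Other"), Total = c(11723, 6),
                      stringsAsFactors = FALSE),
  "2011" = data.frame(Ethnicity = c("Lhotsampa", "Other"), Total = c(14724, 11),
                      stringsAsFactors = FALSE)
)

combined <- combine_years_sketch(yearly)

# Save the combined table as a csv file (written to a temporary directory here).
out_path <- file.path(tempdir(), "ethnicity_combined.csv")
write.csv(combined, out_path, row.names = FALSE)
print(combined)
```

Tagging each table with its year before stacking keeps the year information that would otherwise be lost once the per-year files are merged.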

3.2 Cleaned Data Format

After cleaning the data, we have six ‘.csv’ files that can be found here.

  1. all_arrivals.csv: The total number of refugee resettlements to each of the 50 states in the US from 2009-2018. All raw files to make this file can be found here.

     State       Cases  Inds   Year
     California  5524   11512  2009
     Texas       3638   8826   2009
     New York    2013   5003   2009
     Arizona     1952   4543   2009
     Florida     1834   4196   2009
     Michigan    1602   3460   2009

  2. age_group.csv:

     Age.Group     Male  Female  Total  country  Year
     Under 14      1559  1627    3186   Bhutan   2009
     Age 14 to 20  1258  1310    2568   Bhutan   2009
     Age 21 to 30  1823  1927    3750   Bhutan   2009
     Age 31 to 40  1110  1124    2234   Bhutan   2009
     Age 41 to 50  726   737     1463   Bhutan   2009
     Age 51 to 64  583   626     1209   Bhutan   2009

  3. education.csv:

     Education              Male  Female  Total  country  Year
     Bio Data not Complete  2657  1846    4503   Bhutan   2009
     Graduate School        21    144     165    Bhutan   2009
     Intermediate           509   496     1005   Bhutan   2009
     Kindergarten           123   127     250    Bhutan   2009
     NONE                   100   42      142    Bhutan   2009
     Pre-University         1     1       2      Bhutan   2009

  4. ethnicity.csv:

     Ethnicity  Male  Female  Total  country  Year
     Lhotsampa  7373  7677    15050  Bhutan   2009
     Other      12    15      27     Bhutan   2009
     Lhotsampa  5842  5881    11723  Bhutan   2010
     Other      3     3       6      Bhutan   2010
     Lhotsampa  7314  7410    14724  Bhutan   2011
     Other      4     7       11     Bhutan   2011

  5. native_language.csv:

     Native.Language        Male  Female  Total  country  Year
     Bio Data not Complete  3     4       7      Bhutan   2009
     Dzongka                0     1       1      Bhutan   2009
     English                2     1       3      Bhutan   2009
     Hindi                  1     0       1      Bhutan   2009
     Marathi                1     0       1      Bhutan   2009
     Napoletano-Calabrese   0     1       1      Bhutan   2009

  6. religion.csv:

     Religion   Male  Female  Total  country  Year
     Buddhist   748   853     1601   Bhutan   2009
     Christian  534   518     1052   Bhutan   2009
     Hindu      5798  5993    11791  Bhutan   2009
     Kirat      305   328     633    Bhutan   2009
     Buddhist   925   910     1835   Bhutan   2010
     Christian  468   453     921    Bhutan   2010
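As a quick illustration of how a cleaned file can be used, the sketch below loads the six all_arrivals.csv rows shown above and derives two simple summaries. It assumes the file is comma-separated with the header shown, and that Cases counts resettlement cases while Inds counts individuals; both are our reading of the column names. The rows are passed inline so the example runs without the actual file.

```r
# The six sample rows of all_arrivals.csv shown above, as inline csv text.
csv_text <- "State,Cases,Inds,Year
California,5524,11512,2009
Texas,3638,8826,2009
New York,2013,5003,2009
Arizona,1952,4543,2009
Florida,1834,4196,2009
Michigan,1602,3460,2009"

arrivals <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Average number of individuals per case in each state (assumed meaning of the columns).
arrivals$inds_per_case <- round(arrivals$Inds / arrivals$Cases, 2)

# Total individuals per year across the rows loaded.
totals_by_year <- aggregate(Inds ~ Year, data = arrivals, FUN = sum)

print(arrivals)
print(totals_by_year)
```

With the full file, the same aggregate call would give one total per year from 2009 to 2018, which is the series needed for the temporal analysis.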

3.3 Transforming the Data

TODO: WRITE STUFF

4 Missing Values

The datasets from the RPC (Refugee Processing Center) did not contain any missing values. However, we noticed that the arrivals data contained a row labeled “Unknown State”. Since this row cannot be plotted on our maps, we decided to remove it. Additionally, when we converted the State column to a factor, it had 56 levels. The six levels beyond the fifty states are:

  • American Samoa
  • District of Columbia
  • Guam
  • Puerto Rico
  • Unknown State
  • Virgin Islands

We removed these rows since we are interested only in the fifty states.
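One way to carry out this filtering in base R is sketched below. It relies on state.name, the built-in vector of the fifty state names from R's datasets package. The small data frame is a toy example: the California and Texas counts come from the sample rows earlier in this report, while the other counts are made up for illustration.

```r
# Toy arrivals table; counts for the non-state rows are made up.
arrivals <- data.frame(
  State = factor(c("California", "Texas", "Guam", "District of Columbia",
                   "Unknown State", "Puerto Rico")),
  Inds  = c(11512, 8826, 3, 40, 5, 2)
)

# Levels that are not among the fifty states.
extra_levels <- setdiff(levels(arrivals$State), state.name)

# Keep only rows for the fifty states, then drop the now-unused factor levels.
fifty <- arrivals[arrivals$State %in% state.name, ]
fifty$State <- droplevels(fifty$State)

print(extra_levels)
print(fifty)
```

Calling droplevels after filtering matters for plotting: without it, the removed territories would remain as empty factor levels.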

In the education data, we examined the proportion of missing data relative to the total. For example, more than 30% of the education data for Bhutan is missing.
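A proportion of this kind can be computed as in the following sketch. It assumes that missing education records are those labeled “Bio Data not Complete”, which is our reading of the category shown in education.csv, and it uses made-up counts for two placeholder countries, X and Y, so the numbers below are not results from the RPC data. The function name missing_share is hypothetical.

```r
# Share of records per country whose education level is not recorded.
# Assumption: such records carry the label "Bio Data not Complete".
missing_share <- function(education, missing_label = "Bio Data not Complete") {
  is_missing <- education$Education == missing_label
  total   <- tapply(education$Total, education$country, sum)
  missing <- tapply(education$Total * is_missing, education$country, sum)
  data.frame(country = names(total),
             share_missing = as.numeric(missing / total),
             row.names = NULL,
             stringsAsFactors = FALSE)
}

# Made-up counts for two placeholder countries.
education <- data.frame(
  Education = c("Bio Data not Complete", "Intermediate",
                "Bio Data not Complete", "Kindergarten"),
  Total     = c(40, 60, 10, 90),
  country   = c("X", "X", "Y", "Y"),
  stringsAsFactors = FALSE
)

shares <- missing_share(education)
print(shares)
```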

5 Results

5.1 Temporal Analysis

We take a closer look at the top 5 countries of origin for refugees resettled in the US (https://data.newamericaneconomy.org/en/refugee-resettlement-us/).

5.2 Geographical Analysis

5.3 Demographic Analysis

5.3.1 Religion

The top 5 religions in the world are Christianity, Islam, Hinduism, Buddhism, and Sikhism (https://thecountriesof.com/top-5-largest-religions-in-the-world/). The Iraq data separate the Muslim population into three categories: Muslim, Muslim Shiite, and Muslim Suni. For the purposes of this analysis, we combine them into one.
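Combining the three categories can be sketched in base R as below. The category labels are taken from the text above, but the counts are made up for illustration, and the column Religion_group is a name introduced only for this example.

```r
# Toy religion table for Iraq; the counts are made up.
religion <- data.frame(
  Religion = c("Muslim", "Muslim Shiite", "Muslim Suni", "Christian"),
  Total    = c(100, 250, 150, 80),
  country  = "Iraq",
  Year     = 2009,
  stringsAsFactors = FALSE
)

# Map every label starting with "Muslim" to a single "Muslim" group.
religion$Religion_group <- ifelse(grepl("^Muslim", religion$Religion),
                                  "Muslim", religion$Religion)

# Re-total the counts within each combined group, country, and year.
combined <- aggregate(Total ~ Religion_group + country + Year,
                      data = religion, FUN = sum)
print(combined)
```

Re-aggregating after relabeling ensures each country and year has a single Muslim row, so the combined group is not drawn as three separate categories.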

5.3.2 Education Level

5.3.3 Age Group

DRC and Somalia have the highest proportions of refugees who are under 14.

6 Interactive Component

6.1 Geographics

6.2 Demographics: Religion

7 Conclusion

TODO: WRITE STUFF